Abstract

We complement our previous work [Kropff and Treves, 2007] with the full (non-diluted) solution
describing the stable states of an attractor network that stores correlated patterns of activity. The new
solution provides a good fit to simulations of a network storing the feature norms of McRae and colleagues
[McRae et al., 2005], experimentally obtained combinations of features representing concepts in semantic
memory. We discuss three ways to improve the storage capacity of the network: adding uninformative
neurons, removing informative neurons and introducing popularity-modulated Hebbian learning. We
show that if the strength of synapses is modulated by an exponential decay of the popularity of the
pre-synaptic neuron, any distribution of patterns can be stored and retrieved with approximately
optimal storage capacity, i.e., $C_{min} \propto I_f p$: the minimum number of connections per neuron needed to
sustain the retrieval of a pattern is proportional to the information content of the pattern multiplied by
the number of patterns stored in the network.

1 Introduction 

Autoassociative memory networks can store patterns of neural activity by modifying the synaptic weights
that interconnect neurons [Hopfield, 1982] [Amit, 1989], following the Hebbian rule [Hebb, 1949]. Once a
pattern of activity is stored, it becomes an attractor of the dynamics of the system. Direct evidence of
attractor behavior in the hippocampus of animals in vivo has been reported [Wills et al., 2005]. This kind
of memory system has been proposed to be present at all levels along the cortex of higher-order brains,
where Hebbian plasticity plays a major role.

Most models of autoassociative memory studied in the literature store patterns that are drawn from some
random distribution. Some exceptions appeared during the 1980s, when interest grew around the storage of
patterns derived from hierarchical trees [Parga and Virasoro, 1986] [Gutfreund, 1988]. Of particular interest,
Virasoro [Virasoro, 1988] relates the behavior of networks of general architecture to prosopagnosia, an
impairment that prevents a patient from individuating certain stimuli without affecting the capacity to
categorize them. Interestingly, the results from this model indicate that prosopagnosia is not present in
networks derived from Hebbian plasticity. Other developments have used perceptron-like or other arbitrary
local rules for storing generally correlated patterns [Gardner et al., 1989, Diederich and Opper, 1987] or
patterns with spatial correlation [Monasson, 1992]. More recently, Tsodyks and collaborators
[Blumenfeld et al., 2006] have studied a Hopfield memory in which a sequence of morphs between two
uncorrelated patterns is stored. In that work, the use of a saliency function favouring unexpected over
expected patterns during learning results in the formation of a continuous one-dimensional attractor that
spans the space between the two original memories. The fusion of basins of attraction is an interesting
phenomenon that we do not treat in this work, since we assume that the elements stored in a memory such as
the semantic one are distinguishable by construction.

Feature norms are a way to gain insight into how semantic information is organized in the human brain
[Vinson and Vigliocco, 2002] [Garrard et al., 2001] [McRae et al., 2005]. The information is collected by
asking different types of questions about particular concepts to a large population of subjects. Representations
of the concepts are obtained in terms of the features that appear most often in the subjects' descriptions.
In this work we analyze the feature norms of McRae and colleagues [McRae et al., 2005] for two reasons:
they are public and the size of the dataset allows a statistical approach (it includes 541 concepts described
in terms of 2526 features). The norms were downloaded from the Psychonomic Society Archive of Norms,
Stimuli, and Data web site (www.psychonomic.org/archive) with the consent of the authors.

In section [2] we define a simple binary associative network, showing how it can be modified in order
to store correlated representations. In section [3] we solve the equilibrium equations for the stable attractor
states of the system using a self-consistent signal-to-noise approach. Finally, in section [4] we study the
storage of the feature norms of McRae and colleagues, which represent semantic memory elements.

2 The model 

We assume a network with N neurons and C < N synaptic connections per neuron. If the network stores p
patterns, the parameter $\alpha = p/C$ is a measure of the memory load normalized by the size of the network. In
classical models, the equilibrium properties of large enough networks depend on p, C and N only through
$\alpha$, which allows the definition of the thermodynamic limit ($p \to \infty$, $C \to \infty$, $N \to \infty$, $\alpha$ constant).

The activity of neuron i is described by the variable $\sigma_i$, with $i = 1 \ldots N$. Each of the p patterns is a
particular state of activation of the network. The activity of neuron i in pattern $\mu$ is described by $\xi_i^\mu$, with
$\mu = 1 \ldots p$. The perfect retrieval of pattern $\mu$ is thus characterized by $\sigma_i = \xi_i^\mu$ for all i. We will assume binary
patterns, where $\xi_i^\mu = 0$ if the neuron is silent and $\xi_i^\mu = 1$ if the neuron fires. Consistently, the activity states
of neurons will be limited by $0 \le \sigma_i \le 1$. We will further assume that a fraction a of the neurons is active
in each pattern. This quantity is called the sparseness.

Each neuron receives C synaptic inputs. To describe the architecture of connections we use a random
matrix with elements $c_{ij} = 1$ if a synaptic connection between post-synaptic neuron i and pre-synaptic
neuron j exists and $c_{ij} = 0$ otherwise, with $c_{ii} = 0$ for all i. In addition, synapses have associated
weights $J_{ij}$.

The influence of the network activity on a given neuron i is represented by the field

$$ h_i = \sum_{j=1}^{N} c_{ij} J_{ij} \sigma_j \qquad (1) $$

which enters a sigmoidal activation function in order to update the activity of the neuron

$$ \sigma_i = \left\{ 1 + \exp \beta (U - h_i) \right\}^{-1} \qquad (2) $$

where $\beta$ is the inverse of a temperature parameter and U is a threshold favoring silence among neurons
[Buhmann et al., 1989] [Tsodyks and Feigel'Man, 1988].
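As a minimal sketch of Eqs. 1 and 2 (not the simulation code used for the results reported here), the following Python functions compute the field and the updated activity of a single neuron. The function names, the overflow guard and the toy values are assumptions made only for illustration.

```python
import math

def field(i, c, J, sigma):
    """Field h_i of Eq. 1: sum over j of c_ij * J_ij * sigma_j."""
    return sum(c[i][j] * J[i][j] * sigma[j] for j in range(len(sigma)))

def update(h, beta, U):
    """Sigmoidal activation of Eq. 2: {1 + exp(beta * (U - h))}^(-1)."""
    x = beta * (U - h)
    # Guard against overflow in exp for very large beta (near zero temperature).
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

# Toy example (illustrative values only): 3 neurons, no self-connections.
c = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
J = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
sigma = [1.0, 1.0, 0.0]
h0 = field(0, c, J, sigma)         # only neuron 1 contributes to neuron 0
s0 = update(h0, beta=10.0, U=0.6)  # field below threshold, so activity is low
```

For large values of beta the update approaches a step function at h = U, which is the zero-temperature limit used later in the analysis.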

The learning rule that defines the weights Jij must reflect the Hebbian principle: every pattern in which 
both neurons i and j are active will contribute positively to Jij. In addition to this, the rule must include, 
in order to be optimal, some prior information about pattern statistics. In a one-shot learning paradigm, 
the optimal rule uses the sparseness a as a learning threshold, 

However, as we have shown in previous work [Kropff and Treves, 2007], in order to store correlated
patterns this rule must be modified using $a_j$, the popularity of the pre-synaptic neuron, as a learning
threshold,

This requirement comes from splitting the field into a signal and a noise part,

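$$ h_i = \xi_i^1 \frac{1}{Ca} \sum_{j=1}^{N} c_{ij} (\xi_j^1 - a_j) \sigma_j + \frac{1}{Ca} \sum_{\mu \neq 1}^{p} \xi_i^\mu \sum_{j=1}^{N} c_{ij} (\xi_j^\mu - a_j) \sigma_j \qquad (6) $$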

and, under the hypothesis of Gaussian noise, setting the average to zero and minimizing the variance. The
latter is

If statistical independence is granted between any two neurons, only the first term in Eq. [7] survives when
averaging over $\{\xi\}$.

In Figure [1] we show that the rule in Eq. [3] can effectively store uncorrelated patterns taken from the
distribution

$$ P(\xi_i^\mu) = a\,\delta(\xi_i^\mu - 1) + (1 - a)\,\delta(\xi_i^\mu) \qquad (8) $$

but cannot handle less trivial distributions of patterns, suffering a storage collapse. The storage capacity can
be brought back to normal by using the learning rule in Eq. [4], which is also suitable for storing uncorrelated
patterns.
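The difference between the two thresholds can be illustrated with a short Python sketch. It draws uncorrelated patterns from Eq. 8, measures the popularity of each neuron and builds the weights with either threshold. The explicit weight formula used below, J_ij = (1/(C a)) sum_mu xi_i^mu (xi_j^mu - theta_j), is an assumption of this sketch, as are all function names and parameter values.

```python
import random

def sample_patterns(p, N, a, seed=0):
    """Draw p binary patterns with independent units, P(xi = 1) = a (Eq. 8)."""
    rng = random.Random(seed)
    return [[1 if rng.random() < a else 0 for _ in range(N)] for _ in range(p)]

def popularity(patterns):
    """a_j: fraction of patterns in which neuron j is active."""
    p = len(patterns)
    return [sum(col) / p for col in zip(*patterns)]

def hebbian_weights(patterns, thresholds, C, a):
    """ASSUMED rule: J_ij = (1/(C a)) * sum_mu xi_i^mu * (xi_j^mu - theta_j)."""
    N = len(patterns[0])
    J = [[0.0] * N for _ in range(N)]
    for xi in patterns:
        for i in range(N):
            if xi[i]:                  # only active post-synaptic units contribute
                for j in range(N):
                    if i != j:         # no self-connections
                        J[i][j] += (xi[j] - thresholds[j]) / (C * a)
    return J

p, N, a = 50, 40, 0.2
patterns = sample_patterns(p, N, a)
a_j = popularity(patterns)
J_standard = hebbian_weights(patterns, [a] * N, C=N - 1, a=a)  # threshold: sparseness a
J_modified = hebbian_weights(patterns, a_j, C=N - 1, a=a)      # threshold: popularity a_j
```

For uncorrelated patterns the popularities a_j fluctuate only slightly around a, so the two sets of weights are close to each other; they differ substantially only when some neurons are much more popular than others.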

Having defined the optimal model for the storage of correlated memories, we analyze in the following
sections its storage properties and their consequences through mean-field equations.



3 Self consistent analysis for the stability of retrieval 

We now proceed to derive the equations for the stability of retrieval, similarly to what we have done in
[Kropff and Treves, 2007] but in a network with an arbitrary level of random connectivity, where the
approximation $C \ll N$ is no longer valid [Shiino and Fukai, 1992, Shiino and Fukai, 1993] [Roudi and Treves, 2004].
Furthermore, we introduce patterns with variable mean activation, given by

$$ d_\mu \equiv \frac{1}{N} \sum_{i=1}^{N} \xi_i^\mu \qquad (9) $$

for a generic pattern $\mu$. As a result of this, the optimal weights are given by

Figure 1: The four combinations of two learning rules and two types of dataset. Green: one-shot 'standard'
learning rule of Eq. [3]. Orange: modified rule of Eq. [4]. Solid: trivial distribution of randomly correlated
patterns obtained from Eq. [8]. Dashed: non-trivially correlated patterns obtained using a hierarchical
algorithm. In three cases, the storage capacity (the maximum number of retrievable patterns normalized by
C, the number of connections per neuron) is finite and converges to a common value as C increases.
Only in the case of one-shot learning of correlated patterns is there a storage collapse.



which ensures that patterns with different overall activity will have not only a similar noise but also a similar
signal. In addition, we have introduced a factor $g_j = g(a_j)$ in the weights that may depend on the popularity
of the pre-synaptic neuron. We will consider $g_j = 1$ for all but the last section of this work.

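If the generic pattern 1 is being retrieved, the field in Eq. [1] for neuron i can be written as a signal and
a noise contribution

$$ h_i = \xi_i^1 m_i^1 + \sum_{\mu \neq 1} \xi_i^\mu m_i^\mu \qquad (11) $$

with

$$ m_i^\mu = \frac{1}{C d_\mu} \sum_{j=1}^{N} g_j c_{ij} (\xi_j^\mu - a_j) \sigma_j. \qquad (12) $$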

We hypothesize that in a stable situation the second term in Eq. [11], the noise, can be decomposed into two
contributions

$$ \sum_{\mu \neq 1} \xi_i^\mu m_i^\mu = \gamma_i \sigma_i + \rho_i z_i. \qquad (13) $$

The second term in Eq. [13] represents a Gaussian noise with standard deviation $\rho_i$, with $z_i$ a random variable
drawn from a normal distribution of unit standard deviation. The first term is proportional to the activity
of the neuron i and results from closed synaptic loops that propagate this activity through the network back
to the original neuron, as shown in [Roudi and Treves, 2004]. As is typical in the self-consistent method, we
will proceed to estimate $m_i^\mu$ from the ansatz in Eq. [13], inserting it into Eq. [11] and validating the result
with, again, Eq. [13], checking the consistency of the ansatz.

Since Eq. [13] is a sum of $p \to \infty$ microscopic terms, we can take a single term $\nu$ out and assume that the
sum changes only to a negligible extent. In this way, the field becomes

$$ h_i \simeq \xi_i^1 m_i^1 + \xi_i^\nu m_i^\nu + \gamma_i \sigma_i + \rho_i z_i. \qquad (14) $$

If the network has reached stability, which we assume, updating neuron i does not affect its state. This can
be expressed by inserting the field into Eq. [2],

$$ \sigma_i = \left\{ 1 + \exp(-\beta (h_i - U)) \right\}^{-1} = G\left[ \xi_i^1 m_i^1 + \xi_i^\nu m_i^\nu + \rho_i z_i \right]. \qquad (15) $$

In the RHS of Eq. [15] the contribution of $\gamma_i \sigma_i$ to the field has been reabsorbed into the definition of G[x].
At first order in $\xi_j^\nu m_j^\nu$, Eq. [15] corresponding to neuron j can be written as

$$ \sigma_j \simeq G\left[ \xi_j^1 m_j^1 + \rho_j z_j \right] + G'\left[ \xi_j^1 m_j^1 + \rho_j z_j \right] \xi_j^\nu m_j^\nu. \qquad (16) $$

To simplify the notation we will further use $G_j = G[\xi_j^1 m_j^1 + \rho_j z_j]$ and $G'_j = G'[\xi_j^1 m_j^1 + \rho_j z_j]$. To this order
of approximation, Eq. [12] becomes



$$ m_i^\nu = \frac{1}{C d_\nu} \sum_{j=1}^{N} g_j c_{ij} (\xi_j^\nu - a_j) \left\{ G_j + G'_j \xi_j^\nu m_j^\nu \right\}. \qquad (17) $$

Other terms of the same order in the Taylor expansion could have been introduced in Eq. [16], corresponding
to the derivatives of G with respect to $\xi_j^\mu m_j^\mu$ for $\mu \neq \nu$. It is possible to show, however, that such terms give
a negligible contribution to the field.



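If we define

$$ L_i^\nu \equiv \frac{1}{C d_\nu} \sum_{j=1}^{N} g_j c_{ij} (\xi_j^\nu - a_j) G_j, \qquad K_{ij}^\nu \equiv \frac{1}{C d_\nu} g_j c_{ij} (\xi_j^\nu - a_j) \xi_j^\nu G'_j, \qquad (18) $$

Eq. [17] can be simply expressed as

$$ m_i^\nu = L_i^\nu + \sum_{j=1}^{N} K_{ij}^\nu m_j^\nu. \qquad (19) $$

This equation can be applied recurrently to itself, renaming indices,

$$ m_i^\nu = L_i^\nu + \sum_{j=1}^{N} K_{ij}^\nu L_j^\nu + \sum_{j=1}^{N} \sum_{k=1}^{N} K_{ij}^\nu K_{jk}^\nu m_k^\nu. \qquad (20) $$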

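If applied recurrently infinitely many times, this procedure results in

$$ m_i^\nu = L_i^\nu + \sum_{j=1}^{N} K_{ij}^\nu L_j^\nu + \sum_{j=1}^{N} \sum_{k=1}^{N} K_{ij}^\nu K_{jk}^\nu L_k^\nu + \ldots \qquad (21) $$

which, by exchanging dummy indices, can be re-written as

$$ m_i^\nu = L_i^\nu + \sum_{j=1}^{N} L_j^\nu \left\{ K_{ij}^\nu + \sum_{k=1}^{N} K_{ik}^\nu K_{kj}^\nu + \sum_{k,l=1}^{N} K_{ik}^\nu K_{kl}^\nu K_{lj}^\nu + \ldots \right\}. \qquad (22) $$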
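The recursive substitution above amounts to summing a series in powers of K. The following Python sketch, with a hypothetical 2x2 matrix K and vector L chosen only for illustration, sums the truncated series and checks that the result satisfies the fixed-point relation m = L + K m. Convergence assumes that the couplings are small enough for the series to converge, which is an assumption of the sketch rather than something verified here for the network.

```python
def mat_vec(K, v):
    """Product of a square matrix K (list of rows) with a vector v."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in K]

def series_solution(K, L, n_terms=200):
    """Sum L + K L + K K L + ... truncated after n_terms applications of K."""
    m = list(L)
    term = list(L)
    for _ in range(n_terms):
        term = mat_vec(K, term)   # next power of K applied to L
        m = [m_i + t_i for m_i, t_i in zip(m, term)]
    return m

# Illustrative example with small couplings so that the series converges.
K = [[0.0, 0.2], [0.1, 0.0]]
L = [1.0, 2.0]
m = series_solution(K, L)

# Fixed-point check: m should satisfy m = L + K m.
residual = [m_i - (l_i + km_i) for m_i, l_i, km_i in zip(m, L, mat_vec(K, m))]
```

In this toy case the residual is zero up to rounding, which is the numerical counterpart of applying the recursion infinitely many times.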

Eq. [22] can be decomposed into the contribution of the activity $G_i$ on one side and that of the rest of
the neurons on the other, which will correspond to the first and the second term on the RHS of Eq. [13]. To
re-obtain this equation we multiply by $\xi_i^\mu$ and sum over $\mu$, using the definition of $L_i^\mu$ from Eq. [18]

Let us first treat the first term of Eq. [23], corresponding to $\gamma_i \sigma_i$ in Eq. [13]. Taking into account
that $c_{ii} = 0$ (no self-excitation), only the contribution including the curly brackets survives. As shown in
[Roudi and Treves, 2004], each term inside the curly brackets, containing the product of multiple K's,
differs only to a vanishing order from the product of independent averages, each one corresponding to the
sum of $K_{ab}$ over all pre-synaptic neurons b. In this way,
where we have introduced $\alpha = p/C$, the memory load normalized by the number of connections per
neuron. The $\langle \ldots \rangle$ brackets denote an average over the index $\mu$, and $\Omega$ is a variable of order 1 defined by

$$ \Omega \equiv \frac{1}{N} \sum_{j=1}^{N} a_j (1 - a_j) G'_j g_j \qquad (25) $$
Adding up all the terms with different powers of $\Omega$ in Eq. [24] results in

Since $\Omega$ does not depend on $\mu$, if the mean activation of every pattern equals a the average reduces
to the classical factor.

As postulated in the ansatz, the second term in Eq. [23] is a sum of many independent contributions and
can thus be thought of as a Gaussian noise. Its mean is zero by virtue of the factor $(\xi_l^\mu - a_l)$, uncorrelated
with both $\xi_i^\mu$ (by hypothesis) and the activity factor it multiplies (negligible correlation). Its variance is given by

(27)

which corresponds to the first and only surviving term of the variance expression, the other three terms
vanishing for identical reasons. Distributing the square in the big parenthesis and repeating the steps of
Eq. [24], this results in
If we define

$$ q \equiv \{\cdots\} \frac{1}{N} \sum_{l=1}^{N} G_l^2 \, a_l (1 - a_l) \, g_l^2 \qquad (29) $$

including the whole content of the curly brackets from the previous equation, then the variance of the
Gaussian noise is simply $\alpha a_i q$, and the second term of Eq. [13] becomes

$$ \rho_i z_i = \sqrt{\alpha a_i q} \; z_i \qquad (30) $$

with $z_i$, as before, an independent normally distributed random variable with unit variance. The initial
hypothesis of Eq. [13] is thus self-consistent.

Taking into account these two contributions, the mean field experienced by a neuron i when retrieving
pattern 1 is

where we have used $m_i^1 \simeq m$ and

(32)

is a variable measuring the weighted overlap between the state of the network and the pattern 1, which
together with q (Eq. [29]) and $\Omega$ (Eq. [25]) forms the group of macroscopic variables describing the possible
stable states of the system. While m is a variable related to the signal that pushes the activity toward
the attractor, q and $\Omega$ are noise variables. Diluted connectivity is enough to make the contribution of $\Omega$
negligible (in which case the diluted equations [Kropff and Treves, 2007] are re-obtained), while q gives a
relevant contribution as long as the memory load is significantly different from zero, $\alpha = p/C > 0$.

To simplify the analysis we adopt the zero temperature limit ($\beta \to \infty$), which turns the sigmoidal function
of Eq. [2] into a step function. To obtain the mean activation value of neuron i, the field $h_i$ defined by Eq.
[31] must be inserted into Eq. [2] and the resulting equation solved for $\sigma_i$. This equation is

(33)

where $\Theta[x]$ is the Heaviside function, yielding 1 if x > 0 and 0 otherwise. When $z_i$ has a large enough
modulus, its sign determines one of the possible solutions, $\sigma_i = 1$ or $\sigma_i = 0$. However, for a restricted range
of values, $z_- < z_i < z_+$, both solutions are possible. Using the definition of $\gamma_i$ in Eq. [26] to simplify notation,
we can write $z_+ = (U - \xi_i^1 m)/\sqrt{\alpha q a_i}$ and $z_- = (U - \xi_i^1 m - \gamma_i)/\sqrt{\alpha q a_i}$. A sort of Maxwell rule must be
applied to choose between the two possible solutions [Shiino and Fukai, 1993], by virtue of which the point
of transition between the $\sigma_i = 0$ and the $\sigma_i = 1$ solutions is the average between the two extremes
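A minimal Python sketch of this selection rule is given below. It assumes the expressions for the two extremes of the bistable range written above and places the switch at their midpoint. The function name and the numerical values in the example are illustrative assumptions.

```python
import math

def zero_temperature_activity(z, xi, m, gamma, U, alpha, q, a_i):
    """Activity (0 or 1) of a neuron given its noise sample z.

    z_plus and z_minus bound the range in which both solutions exist;
    following the Maxwell-type rule, the switch is placed at their midpoint.
    """
    scale = math.sqrt(alpha * q * a_i)
    z_plus = (U - xi * m) / scale
    z_minus = (U - xi * m - gamma) / scale
    z_switch = 0.5 * (z_plus + z_minus)
    return 1 if z > z_switch else 0

# Illustrative values: a neuron that is active in the retrieved pattern (xi = 1).
params = dict(m=1.0, gamma=0.2, U=0.6, alpha=0.25, q=1.0, a_i=1.0)
active = zero_temperature_activity(-0.9, xi=1, **params)   # above the switch point
silent = zero_temperature_activity(-1.1, xi=1, **params)   # below the switch point
```

With these values the bistable range is roughly -1.2 < z < -0.8 and the switch sits near z = -1, so the two samples fall on opposite sides of it.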



Inserting Eq. [33] into the definition of the overlap m,

where we have introduced the average over the independent normal distribution Dz for $z_i$. This expression
can be integrated, resulting in
(36)

where we define

$$ \phi(y) \equiv \int_{-\infty}^{y} Dz = \frac{1}{2} \left[ 1 + \mathrm{erf}\left( \frac{y}{\sqrt{2}} \right) \right] \qquad (37) $$
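The error-function expression is the cumulative distribution of the normal measure Dz. A short Python sketch can check the closed form against a direct numerical integration. The function names and the integration settings are assumptions made for illustration.

```python
import math

def gaussian_cdf(y):
    """Closed form: 0.5 * (1 + erf(y / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def gaussian_cdf_numeric(y, lower=-10.0, steps=20000):
    """Trapezoidal integration of the standard normal density from `lower` to y."""
    width = (y - lower) / steps
    density = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    total = 0.5 * (density(lower) + density(y))
    total += sum(density(lower + k * width) for k in range(1, steps))
    return total * width

closed = gaussian_cdf(0.7)
numeric = gaussian_cdf_numeric(0.7)   # agrees with the closed form to high precision
```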



Following the same procedure, Eq. [29] can be rewritten as 
Before repeating these steps for the variable $\Omega$ we note that

where we have applied integration by parts. Eq.

Eqs. [36], [38] and [40] define the stable states of the network. Retrieval is successful if the stable value
of m is close to 1. In Figure [2] we show the performance of a fully connected network storing the feature
norms of McRae and colleagues [McRae et al., 2005] in three situations: the theoretical prediction for a diluted
network as in [Kropff and Treves, 2007], the theoretical prediction for a fully connected network calculated from
Eqs. [36]-[40], and the actual simulations of the network. The figure shows that the fully connected theory
better approximates the simulations, performed with random subgroups of patterns of varying size p and full
connectivity for each neuron, C = N, equal to the total number of features involved in the representation of
the subgroup of concepts.

Figure 2: Simulations and numerical solutions of the equations of a network storing random subgroups of
patterns taken from the feature norms of McRae and colleagues. The performance of the network depends
strongly on the size of the subgroup. Though this is observed in the highly diluted approximation, the decay
in performance is not enough to explain the data. It is the full solution with g(x) = 1 that results in a good
fit of the simulations. In each simulation, the number of neurons equals the number of features describing
some of the stored concepts, and there is full connectivity between neurons, C = N.

Finally, we can rewrite Eqs. [36]-[40] in a continuous way by introducing two types of popularity distribution
across neurons:

$$ F(x) = P(a_i = x) \qquad (41) $$

as the global distribution, and

$$ f(x) = P(a_i = x \mid \xi_i^1 = 1) \qquad (42) $$

as the distribution related to the pattern that is being retrieved.
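For a finite set of binary patterns, both distributions can be estimated by simple counting. The sketch below uses a hypothetical toy dataset of four patterns over four neurons; the helper names are assumptions made for illustration.

```python
from collections import Counter

def popularity(patterns):
    """a_i: fraction of patterns in which neuron i is active."""
    p = len(patterns)
    return [sum(col) / p for col in zip(*patterns)]

def distribution(values):
    """Empirical probability of each distinct value in a list."""
    counts = Counter(values)
    total = len(values)
    return {x: n / total for x, n in counts.items()}

# Toy dataset (illustrative): rows are patterns, columns are neurons.
patterns = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
a = popularity(patterns)
F = distribution(a)                       # global distribution, Eq. 41
retrieved = patterns[0]
f = distribution([a_i for a_i, xi in zip(a, retrieved) if xi == 1])  # Eq. 42
```

Here F counts all neurons, while f counts only the neurons that are active in the pattern being retrieved, so f gives more weight to popular neurons.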

The equations describing the stable values of the variables become 
4 The storage of feature norms



In [Kropff and Treves, 2007] we have shown that the robustness of a memory in a highly diluted network is
inversely related to the information it carries. More specifically, a stored memory needs a minimum number
of connections per neuron $C_{min}$ that is proportional to

$$ I_f \equiv \int_0^1 f(x)\, x (1 - x)\, dx. \qquad (45) $$

In this way, if connections are randomly damaged in a network, the most informative memories are selectively
lost.
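For a finite dataset, the integral in Eq. 45 reduces to an average of a_i (1 - a_i) over the neurons that are active in the pattern. The following Python sketch, with hypothetical popularities and patterns, illustrates that a pattern built from very unpopular neurons has a lower value of this quantity than one built from more popular ones.

```python
def pseudo_information(pattern, a):
    """Discrete estimate of Eq. 45: mean of a_i * (1 - a_i) over active neurons."""
    active = [a_i for a_i, xi in zip(a, pattern) if xi == 1]
    return sum(x * (1.0 - x) for x in active) / len(active)

# Illustrative popularities of four neurons and two patterns defined over them.
a = [0.75, 0.5, 0.25, 0.25]
I_first = pseudo_information([1, 1, 0, 0], a)   # uses the two most popular neurons
I_second = pseudo_information([0, 0, 0, 1], a)  # uses a single unpopular neuron
```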

The distribution F(x) affects the retrievability of all memories. As we have shown in the same paper, it
is typically a function with a maximum near x = 0. The relevant characteristic of F(x) is its tail for large
x. If F(x) decays fast enough, the minimal connectivity scales like

(46)

where $I_F$ corresponds to the same pseudo-information function as in Eq. [45] but using the distribution F(x).
If F(x) decays exponentially ($F(x) \sim \exp(-x/a)$), the scaling of the minimal connectivity is the same, with
only a different logarithmic correction,

The big difference appears when F(x) has a tail that decays as slowly as a power law ($F(x) \sim x^{-\gamma}$). The
minimal connectivity is then much larger

(48)

since the sparseness, measuring the global activity of the network, is in cortical networks $a \ll 1$.
Unfortunately, as can be seen in Figure [3], the distribution of popularity F(x) for the feature norms of McRae and
colleagues is of this last type. This is the reason why, as shown in Figure [2], the performance of the network
is very poor in storing and retrieving patterns taken from this dataset. In a fully connected network such as the
one shown in the figure, a stored pattern can be retrieved as long as its minimal connectivity $C_{min} < N$,
the number of connections per neuron. Along the x axis of the figure, representing the number of patterns
from the norms stored in the network, the average of $I_f$ is rather constant, p and N increase proportionally
and a decreases, eventually taking $C_{min}$ over the full connectivity limit.

In the following subsections, we analyze different ways to increase this poor storage capacity and effectively
store and retrieve the feature norms in an autoassociative memory.

Figure 3: The popularity distribution F(x) of the feature norms is a power law, with $\gamma \simeq 2.16$. Note that
both axes are logarithmic. In the inset, the same plot appears with linear axes, including the corresponding
fit.
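One simple way to estimate the exponent of a power law is a least-squares line on the log-log plot. The Python sketch below shows this on synthetic data; the fitting procedure is an assumption made for illustration and is not necessarily the one used to obtain the value quoted in the caption.

```python
import math

def fit_power_law_exponent(xs, Fs):
    """Least-squares slope of log F versus log x; returns gamma for F ~ x^(-gamma)."""
    lx = [math.log(x) for x in xs]
    lF = [math.log(F) for F in Fs]
    n = len(lx)
    mean_x, mean_F = sum(lx) / n, sum(lF) / n
    covariance = sum((u - mean_x) * (v - mean_F) for u, v in zip(lx, lF))
    variance = sum((u - mean_x) ** 2 for u in lx)
    return -covariance / variance

# Synthetic data that follow an exact power law (illustrative values only).
xs = [0.01, 0.02, 0.05, 0.1, 0.2]
Fs = [x ** -2.16 for x in xs]
gamma = fit_power_law_exponent(xs, Fs)   # recovers the exponent used to build Fs
```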

4.1 Adding uninformative neurons 

As discussed in [Kropff and Treves, 2007], a way to increase the storage capacity of the network in general
terms is to push the distribution F(x) toward the smaller values of x. One possibility is to add neurons
with low information value (i.e. with low popularity) so as to make $I_f$ smaller on average without affecting
the sparseness a too much. In Figure [4]a we show that the full set of patterns from the feature norms can
be stored and retrieved if 5 new neurons per pattern are added, each active in that particular pattern and in no
other one.
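The construction can be sketched in Python as follows. The helper appends k new neurons per pattern, each active only in that pattern, so that every added neuron has the minimal popularity 1/p. The function name and the toy patterns are assumptions made for illustration.

```python
def add_private_neurons(patterns, k):
    """Append k new neurons per pattern, each active in that pattern only."""
    p = len(patterns)
    extended = []
    for mu, xi in enumerate(patterns):
        extra = [0] * (k * p)
        for n in range(k):
            extra[mu * k + n] = 1     # block of k neurons reserved for pattern mu
        extended.append(list(xi) + extra)
    return extended

# Toy example: two patterns over two neurons, two private neurons added to each.
patterns = [[1, 0], [1, 1]]
extended = add_private_neurons(patterns, k=2)
new_popularity = [sum(col) / len(extended) for col in zip(*extended)][2:]
```

Each pattern gains k active units, all of which contribute the smallest possible value of a_i (1 - a_i) to its pseudo-information average.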

4.2 Removing informative neurons 

A similar effect on the distribution F(x) can be obtained by selectively eliminating the most informative
neurons. In Figure [4]b we show that if the full set of patterns is stored, a retrieval performance of ~80% is
achieved when the 40 most informative features are eliminated. We estimate that 100% performance should be
achieved if around 60 neurons were selectively eliminated.

It is not common in the neural literature to find a poor performance that is improved by damaging the
network. This must be interpreted in the following way. The connectivity of the network is not enough to
sustain the retrieval of the stored patterns, which are too informative to be stable states of the system. By
throwing away information, the system can be brought back to work. However, a price is being paid: the
representations are impoverished since they no longer contain the most informative features.
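A possible implementation of the selective removal is sketched below in Python. It ranks neurons by a_j (1 - a_j), the weight that appears in the integrand of Eq. 45, and drops the top k. This ranking criterion is an assumption of the sketch, as are the function name and the toy patterns.

```python
def remove_most_informative(patterns, k):
    """Drop the k neurons with the largest a_j * (1 - a_j) from every pattern."""
    p = len(patterns)
    a = [sum(col) / p for col in zip(*patterns)]
    # ASSUMPTION: informativeness of neuron j is measured by a_j * (1 - a_j).
    ranked = sorted(range(len(a)), key=lambda j: a[j] * (1.0 - a[j]), reverse=True)
    removed = set(ranked[:k])
    kept = [j for j in range(len(a)) if j not in removed]
    return [[xi[j] for j in kept] for xi in patterns], kept

# Toy dataset (illustrative): popularities are 0.75, 0.5, 0.25 and 0.25.
patterns = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
reduced, kept = remove_most_informative(patterns, k=1)
```

In the toy dataset the neuron with popularity 0.5 has the largest a_j (1 - a_j) and is the one removed.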

4.3 Popularity-modulated weights 

A final way to push the distribution F(x) toward low values of x can be inferred from Eqs. [43]. Indeed, g(x)
can be thought of as a modulator of the distributions F(x) and f(x). Following [Kropff and Treves, 2007],
if g(x) decays exponentially or faster, the storage capacity of a set of patterns with any decaying F(x)
distribution should be brought back to a $C_{min} \propto p I_f$ dependence, without the $a^{-1} \gg 1$ factor.



Figure 4: Adding or taking neurons affects the overall distribution F(x) and, thus, the performance of the
network. The starting point for both situations is 2526 neurons corresponding to all the features in the
norms. a: Adding 5 neurons with minimal popularity per pattern is enough to get 100% performance. Note
that the transition is sharp. b: Removing the 40 most informative neurons also results in an improved
performance, in this case of 80% of the stored patterns. (x axis: total number of neurons.)



In Figure [5] we analyze two possible g(x) functions that favor low over high values of x, one inversely
proportional to $x(1-x)$ and the other inversely proportional to $\sqrt{x(1-x)}$.

The storage capacity of the network increases drastically in both cases. Furthermore, we estimate that
~60% of the lost memories in the figure suffer from too high a value of the threshold U, set to 0.6 as in all
simulations in the paper. This value was chosen to maximize the performance in the previous
simulations. However, with a much more controlled noise, the optimal threshold should be lower, generally
around m/2. Setting the threshold at this level might improve the performance of the
network even further.
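The two modulation functions described in the caption of Figure 5 can be written in a few lines of Python. The normalization used below (mean of g equal to 1 over the popularities of the dataset) is an assumption of this sketch, since the text only states that g is normalized so as to keep the average field of order 1. Both functions require 0 < x < 1.

```python
import math

def g_inverse(x):
    """g(x) inversely proportional to x * (1 - x)."""
    return 1.0 / (x * (1.0 - x))

def g_inverse_sqrt(x):
    """g(x) inversely proportional to sqrt(x * (1 - x))."""
    return 1.0 / math.sqrt(x * (1.0 - x))

def normalized(g, popularities):
    """Rescale g so that its mean over the given popularities is 1 (ASSUMED normalization)."""
    mean_g = sum(g(x) for x in popularities) / len(popularities)
    return lambda x: g(x) / mean_g

# Illustrative popularities; both modulators favor the least popular neurons.
a = [0.1, 0.2, 0.5]
g1 = normalized(g_inverse, a)
g2 = normalized(g_inverse_sqrt, a)
weights_g1 = [g1(x) for x in a]
weights_g2 = [g2(x) for x in a]
```

The first function suppresses popular neurons more strongly than the second, but both are decreasing in x over the range 0 < x < 0.5 that is relevant for sparse features.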
5 Discussion

We have presented the full non-diluted solution describing the stable states of a network that stores correlated
patterns. A simple Hebbian learning rule is applicable as long as neurons can be treated as statistically
independent. In order to analyze the storage of the patterns taken from the feature norms of McRae
and colleagues, we include in the learning rule the possibility that the global activity is different for each
pattern. The full solution explains the poor performance of autoassociative networks storing the feature
norms [McRae et al., 1997] [Cree et al., 1999] [Cree et al., 2006]. We show that these data have a popularity
distribution decaying as a power law, the worst of the cases analyzed in [Kropff and Treves, 2007].

The three proposed solutions aiming to improve the storage capacity of the network have very different
scopes. Adding unpopular neurons is a feasible solution for McRae and colleagues. In the procedure of
collecting the norms, a threshold is used to decide whether or not a given feature is relevant enough to
be included in the dataset. Lowering the threshold would result in a set of patterns with many more very
uninformative features. Second, the elimination of very informative neurons in a damaged network
could be achieved by selectively damaging the most active ones, bringing the network back to work. Finally,
the modulation of synaptic strength following pre-synaptic popularity can be considered an intermediate
solution between the two extremes. Whether or not it is a cortical strategy applied to deal with correlated
representations is a question for which we have as yet no experimental evidence.

Figure 5: Simulations (dashed line) and theoretical predictions (solid line) of a network storing subgroups
of patterns of varying size taken from the feature norms of McRae and colleagues with a popularity-modulated
Hebbian learning rule. The thin violet lines use a value of g(x) inversely proportional to $x(1-x)$, normalized
so as to maintain the average field of order 1. The thick green line corresponds to a g(x) inversely proportional
to $\sqrt{x(1-x)}$. Following our predictions, the exact form of g(x) does not affect the general performance,
which is substantially improved with respect to the simulations with g(x) = 1, copied from Figure [2] in grey
dots.